Dimensionality on Summarization
Abstract
Summarization is one of the key features of human intelligence. It plays an important role in understanding and representation. With the rapid and continual expansion of texts, pictures and videos in cyberspace, automatic summarization is becoming more and more desirable. Text summarization has been studied for over half a century, but it is still hard to automatically generate a satisfactory summary. Traditional methods process texts empirically and neglect the fundamental characteristics and principles of language use and understanding. This paper summarizes previous text summarization approaches in a multi-dimensional classification space, introduces a multi-dimensional methodology for research and development, unveils the basic characteristics and principles of language use and understanding, investigates some fundamental mechanisms of summarization, studies the dimensions and forms of representations, and proposes a multi-dimensional evaluation mechanism. The investigation extends to the incorporation of pictures into summaries and to the summarization of videos, graphs and pictures, and then arrives at a general summarization framework. Further, some basic behaviors of summarization are studied in the complex space consisting of cyberspace, physical space and social space. The basic viewpoints include: (1) a representation suitable for summarization should have a core, indicated by its intention and extension; (2) summarization is an open process of various interactions, involving various explicit and implicit citations; and (3) the form of a summary is diverse, and summarization is carried out through multiple dimensions.
Similar resources
Text Summarization as Feature Selection for Arabic Text Classification
Text classification (TC), or text categorization, is the task of assigning a document to one or more predefined classes or categories. A common problem in TC is the high number of terms or features in the document(s) to be classified (the curse of dimensionality). This problem can be solved by selecting the most important terms. In this study, automatic text summarization is used for feature selection. S...
Dimensionality Reduction Aids Term Co-Occurrence Based Multi-Document Summarization
A key task in an extraction system for query-oriented multi-document summarisation, necessary for computing relevance and redundancy, is modelling text semantics. In the Embra system, we use a representation derived from the singular value decomposition of a term co-occurrence matrix. We present methods to show the reliability of performance improvements. We find that Embra performs better with...
Sparse Summarization of Robotic Grasp Data
In this paper we propose a new approach for learning a summarized representation of high-dimensional continuous data. We apply the model to learn efficient representations of grasp data for two robotic scenarios that facilitate a compact summarization. Our technique consists of a Bayesian non-parametric model capable of encoding high-dimensional data from complex distributions using a sparse su...
A survey on Automatic Text Summarization
Text summarization endeavors to produce a summary version of a text while maintaining the original ideas. The textual content on the web, in particular, is growing at an exponential rate. The ability to sift through such a massive amount of data in order to extract the useful information is a major undertaking and requires an automatic mechanism to aid with the extant repository of informa...
Multilabel Associative Text Classification Using Summarization
This paper addresses the curse of dimensionality in the text classification problem using text summarization. Classification and association rule mining can produce classifiers that are better organized and more precise than those of established techniques [1]. However, the associative classification technique still suffers from the vast set of mined rules. Thus, this work brings in advantages of Automat...
Journal title:
- CoRR
Volume: abs/1507.00209
Pages: -
Publication date: 2015